NSF PAR Search | NSF Public Access Repository

MITgcm-AD v2: Open source tangent linear and adjoint modeling framework for the oceans and atmosphere enabled by the Automatic Differentiation tool Tapenade

https://doi.org/10.1016/j.future.2024.107512

Gaikwad, Shreyas Sunil; Narayanan, Sri Hari Krishna; Hascoët, Laurent; Campin, Jean-Michel; Pillar, Helen; Nguyen, An; Hückelheim, Jan; Hovland, Paul; Heimbach, Patrick (February 2025, Future Generation Computer Systems)

Parametric Sensitivities of a Wind-driven Baroclinic Ocean Using Neural Surrogates

https://doi.org/10.1145/3659914.3659920

Sun, Yixuan; Cucuzzella, Elizabeth; Brus, Steven; Narayanan, Sri Hari Krishna; Nadiga, Balasubramanya; Van Roekel, Luke; Hückelheim, Jan; Madireddy, Sandeep; Heimbach, Patrick (June 2024, ACM)

Model Checking Race-Freedom When “Sequential Consistency for Data-Race-Free Programs” is Guaranteed

https://doi.org/10.1007/978-3-031-37703-7_13

Wu, Wenhao; Hückelheim, Jan; Hovland, Paul D.; Luo, Ziqing; Siegel, Stephen F. (July 2023, International Conference on Computer Aided Verification)
Enea, Constantin; Lal, Akash (Eds.)
Many parallel programming models guarantee that if all sequentially consistent (SC) executions of a program are free of data races, then all executions of the program will appear to be sequentially consistent. This greatly simplifies reasoning about the program, but leaves open the question of how to verify that all SC executions are race-free. In this paper, we show that with a few simple modifications, model checking can be an effective tool for verifying race-freedom. We explore this technique on a suite of C programs parallelized with OpenMP.
Verifying Fortran Programs with CIVL

https://doi.org/10.1007/978-3-030-99524-9_6

Wu, Wenhao; Hückelheim, Jan; Hovland, Paul D.; Siegel, Stephen F. (March 2022, TACAS 2022: Tools and Algorithms for the Construction and Analysis of Systems)
Fisman, Dana; Rosu, Grigore (Eds.)
Fortran is widely used in computational science, engineering, and high-performance computing. This paper presents an extension to the CIVL verification framework to check correctness properties of Fortran programs. Unlike previous work that translates Fortran to C, LLVM IR, or other intermediate formats before verification, our work allows CIVL to directly consume Fortran source files. We extended the parsing, translation, and analysis phases to support Fortran-specific features such as array slicing and reshaping, and to find program violations that are specific to Fortran, such as argument aliasing rule violations, invalid use of variable and function attributes, or defects due to Fortran's unspecified expression evaluation order. We demonstrate the usefulness of our tool on a verification benchmark suite and kernels extracted from a real-world application.
Scalable Automatic Differentiation of Multiple Parallel Paradigms through Compiler Augmentation

https://doi.org/10.1109/SC41404.2022.00065

Moses, William S.; Narayanan, Sri Hari; Paehler, Ludger; Churavy, Valentin; Schanen, Michel; Hückelheim, Jan; Doerfert, Johannes; Hovland, Paul (November 2022, International Conference for High Performance Computing, Networking, Storage and Analysis, IEEE)

Derivatives are key to numerous science, engineering, and machine learning applications. While existing tools generate derivatives of programs in a single language, modern parallel applications combine a set of frameworks and languages to leverage available performance and function in an evolving hardware landscape. We propose a scheme for differentiating arbitrary DAG-based parallelism that preserves scalability and efficiency, implemented in the LLVM-based Enzyme automatic differentiation framework. By integrating with a full-fledged compiler backend, Enzyme can differentiate numerous parallel frameworks and directly control code generation. Combined with its ability to differentiate any LLVM-based language, this flexibility permits Enzyme to leverage the compiler tool chain for parallel and differentiation-specific optimizations. We differentiate nine distinct versions of the LULESH and miniBUDE applications, written in different programming languages (C++, Julia) and parallel frameworks (OpenMP, MPI, RAJA, Julia tasks, MPI.jl), demonstrating similar scalability to the original program. On benchmarks with 64 threads or nodes, we find a differentiation overhead of 3.4–6.8× on C++ and 5.4–12.5× on Julia.
Reverse-mode automatic differentiation and optimization of GPU kernels via enzyme

https://doi.org/10.1145/3458817.3476165

Moses, William S.; Churavy, Valentin; Paehler, Ludger; Hückelheim, Jan; Narayanan, Sri Hari; Schanen, Michel; Doerfert, Johannes (November 2021, SC '21: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis)

Computing derivatives is key to many algorithms in scientific computing and machine learning such as optimization, uncertainty quantification, and stability analysis. Enzyme is an LLVM compiler plugin that performs reverse-mode automatic differentiation (AD) and thus generates high-performance gradients of programs in languages including C/C++, Fortran, Julia, and Rust. Prior to this work, Enzyme and other AD tools were not capable of generating gradients of GPU kernels. Our paper presents a combination of novel techniques that make Enzyme the first fully automatic reverse-mode AD tool to generate gradients of GPU kernels. Since, unlike other tools, Enzyme performs automatic differentiation within a general-purpose compiler, we are able to introduce several novel GPU and AD-specific optimizations. To show the generality and efficiency of our approach, we compute gradients of five GPU-based HPC applications, executed on NVIDIA and AMD GPUs. All benchmarks run within an order of magnitude of the original program's execution time. Without GPU and AD-specific optimizations, gradients of GPU kernels either fail to run from a lack of resources or have infeasible overhead. Finally, we demonstrate that increasing the problem size by either increasing the number of threads or increasing the work per thread does not substantially impact the overhead from differentiation.
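For readers unfamiliar with the term, the following is a minimal sketch of what reverse-mode AD computes, written in plain Python. It is not Enzyme's implementation (Enzyme operates on LLVM IR inside the compiler), and the names `Var`, `sin`, `backward`, and `gradient` are hypothetical, introduced only for this illustration. The sketch records each operation's local partial derivatives during the forward evaluation, then sweeps the recorded graph in reverse to accumulate the gradient of one output with respect to all inputs.

```python
import math


class Var:
    """A scalar that remembers which values it was computed from (illustrative only)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent Var, local partial derivative)
        self.grad = 0.0         # accumulated d(output)/d(self), filled by backward()

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def sin(x):
    """Sine of a Var, recording d(sin x)/dx = cos x."""
    return Var(math.sin(x.value), ((x, math.cos(x.value)),))


def backward(output):
    """Reverse sweep: propagate d(output)/d(node) from the output to the inputs."""
    order, seen = [], set()

    def visit(node):
        # Post-order traversal, so every node appears after all of its parents.
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Reversed post-order visits each node before any of its parents.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local


def gradient(f, *args):
    """Return the gradient of the scalar function f at the point args."""
    inputs = [Var(a) for a in args]
    backward(f(*inputs))
    return [v.grad for v in inputs]


# f(x, y) = x*y + sin(x), so df/dx = y + cos(x) and df/dy = x.
grad = gradient(lambda x, y: x * y + sin(x), 2.0, 3.0)
```

One reverse sweep yields the derivative with respect to every input, which is why reverse mode suits gradients of functions with many inputs and a single output. The abstract's point is that doing this for GPU kernels requires additional techniques beyond a sketch like this one.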